Pietro Pollo, Szymon M. Drobniak, Hamed Haselimashhadi, Malgorzata Lagisz, Ayumi Mizuno, Laura A. B. Wilson, Daniel W. A. Noble, Shinichi Nakagawa
Published
January 8, 2026
1 Update
We will update this tutorial when necessary. Readers can access the latest version in our GitHub repository.
If you have any questions, or find any errors or bugs, please contact Pietro Pollo (pietro_pollo@hotmail.com) or Shinichi Nakagawa (snakagaw@ualberta.ca).
2 Introduction
This online material is a supplement to our paper “Beyond sex differences in mean: meta-analysis of differences in skewness, kurtosis, and correlation”. You will see how to calculate the new effect size statistics we have proposed and how to use them in a meta-analytical model using the metafor package in R.
3 Content
In this online material, we provide details on the simulations we ran to evaluate the effectiveness of our proposed effect size statistics (and their associated sampling error). We also show how to (1) calculate our newly proposed effect sizes (\(\Delta sk\), \(\Delta ku\), \(\Delta Zr\)) and (2) exemplify their use with data from the International Mouse Phenotyping Consortium.
4 Prerequisites
4.1 Loading packages
Our tutorial uses R statistical software and existing R packages, which you will first need to download and install.
If a package is available on CRAN, use install.packages() to install it. For example, to install the metafor package, you can execute install.packages("metafor") in the console (bottom-left pane of RStudio).
Version information of each package is listed at the end of this tutorial.
4.2 Custom functions
We also provide some additional helper functions to calculate effect sizes, process data, and visualise our results. The most straightforward way to use these custom functions is to run the code chunk below. Alternatively, paste the code into the console and hit Enter to have R ‘learn’ these custom functions.
If you want to use these custom functions in your own data, you will need to change the variable names according to your own data (check out the R code and you will see what we mean).
5 Simulations
5.1 Extended simulation details
We conducted Monte Carlo simulations to evaluate bias and variance estimation for our new effect sizes \(\Delta sk\), \(\Delta ku\), and \(\Delta Zr\). For \(\Delta sk\) and \(\Delta ku\), we simulated independent samples for two groups from Pearson distributions with known moments using the rpearson function from the PearsonDS R package (vers. 1.3.2, Becker and Klößner 2025). We conducted two simulations: (1) varying skewness between groups with moderate departures from normality (group-specific skewness \(sk \in \{-1, -0.5, 0, 0.5, 1\}\), with kurtosis fixed at 3) and (2) holding skewness constant (\(sk\) = 0) while manipulating kurtosis (\(ku \in \{2.5, 3, 4, 5, 6\}\)). In all cases, we simulated scenarios where: (i) the variance of the two groups was the same (\(\sigma^2_{2}\) = \(\sigma^2_{1}\) = 1) or different (\(2\sigma^2_{2}\) versus \(\sigma^2_{1}\)); (ii) the mean of the two groups was the same (\(\mu_{2}\) = \(\mu_{1}\) = 0) or different (\(\mu_{2}\) = 5, \(\mu_{1}\) = 0). For simplicity, we assumed equal sample sizes between groups, with sample size varying over \(n \in \{10, 20, \dots, 100, 150, 500\}\). We created all unique combinations of the above scenarios, resulting in 1200 independent scenarios (100 scenarios at each sample size; see examples in Section 9.2 and Section 9.3). We estimated \(\Delta sk\) and \(\Delta ku\) for each scenario using the formulas for within-group sample skewness with small-sample correction (Eq. 1 in main manuscript) and excess kurtosis with small-sample correction (Eq. 3 in main manuscript). To estimate the associated sampling variance for \(\Delta sk\) and \(\Delta ku\), we used the analytical variance estimators derived here and a re-sampling (jackknife) approach, computing group sampling variances separately and then pooling them. Importantly, our simulations assume no correlation between groups.
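To make the estimators concrete, the sketch below computes \(\Delta sk\) and a jackknife sampling variance for two independent groups in base R. This is our own minimal illustration, not the authors' simulation code: it draws from rnorm() rather than PearsonDS::rpearson(), and the helper names (sk_corrected, delta_sk, jack_var_sk) are hypothetical. The skewness estimator is the standard small-sample-corrected (G1) form, which we assume corresponds to Eq. 1 of the main manuscript.

```r
# Small-sample-corrected sample skewness (the standard G1 estimator;
# assumed here to match Eq. 1 of the main manuscript).
sk_corrected <- function(x) {
  n  <- length(x)
  m2 <- mean((x - mean(x))^2)
  m3 <- mean((x - mean(x))^3)
  g1 <- m3 / m2^(3 / 2)
  g1 * sqrt(n * (n - 1)) / (n - 2)
}

# Delta sk: difference in corrected skewness between two independent
# groups (the sign convention, group 2 minus group 1, is our assumption).
delta_sk <- function(x1, x2) sk_corrected(x2) - sk_corrected(x1)

# Jackknife sampling variance of skewness within one group; group
# variances are then summed (pooled) because the groups are independent.
jack_var_sk <- function(x) {
  n   <- length(x)
  loo <- vapply(seq_len(n), function(i) sk_corrected(x[-i]), numeric(1))
  (n - 1) / n * sum((loo - mean(loo))^2)
}

set.seed(1)
g1 <- rnorm(50)   # stand-ins for Pearson-distribution draws
g2 <- rnorm(50)
est <- delta_sk(g1, g2)
v   <- jack_var_sk(g1) + jack_var_sk(g2)
c(delta_sk = est, sampling_var = v)
```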
For the \(\Delta Zr\) simulations, we simulated two groups, each containing two variables with known within-group correlations. We drew bivariate normal data with target within-group correlations \(r \in \{-0.8, -0.4, -0.2, 0, 0.2, 0.4, 0.6, 0.8\}\) using the mvrnorm function in the MASS package of R (vers. 7.3-61, Venables and Ripley 2002). Marginals were standard normal and group sizes varied over \(n \in \{10, 20, \dots, 100, 150, 500\}\). Again, we created all unique combinations of scenarios, resulting in 768 unique scenarios. We estimated \(\Delta Zr\) by applying Fisher's Z transformation (\(Zr\)) and calculating \(\Delta Zr\) as the difference in \(Zr\) between groups (Eqs. 9–11 in main manuscript). Sampling variance for \(\Delta Zr\) was approximated using the standard analytic variance \(\frac{1}{n-3}\) per group (summed; Eq. 10 in main manuscript) and a jackknife approach. Again, we assumed no correlation between our groups.
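The \(\Delta Zr\) computation described above can be sketched in a few lines of base R. The function name delta_zr is ours, and the sign convention (group 2 minus group 1) is an assumption:

```r
# Delta Zr between two independent groups, with the analytic sampling
# variance summed across groups (1/(n - 3) per group). A minimal sketch;
# the function name and sign convention are ours.
delta_zr <- function(r1, n1, r2, n2) {
  zr1 <- atanh(r1)   # Fisher's Z = 0.5 * log((1 + r) / (1 - r))
  zr2 <- atanh(r2)
  c(est = zr2 - zr1,
    var = 1 / (n1 - 3) + 1 / (n2 - 3))
}

delta_zr(r1 = 0.2, n1 = 50, r2 = 0.6, n2 = 50)
```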
Across all simulations we conducted 2,500 replicates of each scenario. Performance metrics were (a) bias of the point estimator (mean estimate minus truth) (Equation 1), (b) relative bias of the sampling-variance estimator (Equation 2), (c) coverage (95%) and (d) Monte-Carlo standard errors (MCSEs) (Equation 3).
\[\text{Bias}(\hat{\theta}) = \frac{\sum_{i=1}^{n_{\text{sim}}} \hat{\theta}_i}{n_{\text{sim}}} - \theta_{mu} \tag{1}\]
where \(\theta_{mu}\) is the true overall mean effect (i.e., the true \(\Delta sk\), \(\Delta ku\), or \(\Delta Zr\) between groups), and \(\hat{\theta}_i\) is the effect estimate from simulation \(i\).
We calculated coverage as the proportion of 95% confidence intervals (CIs) that contained the true effect size across all simulations for each scenario.
To understand bias in the sampling variance, we computed the relative bias for each effect measure as:
\[\text{Relative Bias}(\hat{\theta}_{SV}) = \left( \frac{\mathbb{E}[\hat{SV}] - \hat{\theta}^2}{\hat{\theta}^2} \right) \cdot 100\% \tag{2}\]
where \(\hat{\theta}^2\) is the estimated sampling variance for the effect measure and \(\mathbb{E}[\hat{SV}]\) is the expected value of the sampling variance. Relative bias can be calculated using different combinations of estimates. For example, \(\hat{\theta}^2\) in the denominator can be taken from the small-sample-corrected point estimate for \(\Delta sk\), \(\Delta ku\), or \(\Delta Zr\), or from the jackknife estimate. Alternatively, we can use the jackknife estimate for \(\mathbb{E}[\hat{SV}]\) across each simulation. We computed all four combinations for our performance measures. For each of our performance measures, we also computed the Monte Carlo standard error (MCSE) of the estimated bias with
\[\text{MCSE} = \sqrt{\frac{S^2_{\hat{\theta}}}{n_{\text{sim}}}} \tag{3}\]
where \(S^2_{\hat{\theta}}\) is the sample variance of the estimated effects across simulations and \(n_{\text{sim}}\) is the number of simulations.
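For a single scenario, the four performance metrics can be computed from the replicate estimates in a few lines of base R. This is a sketch under our own naming (performance, est, sv, theta_true are all hypothetical); here relative bias is taken against the empirical variance of the estimates, one of the several reference choices discussed above.

```r
# Performance metrics for one simulated scenario (names are ours).
# est: point estimates across replicates; sv: their estimated sampling
# variances; theta_true: the known true effect for the scenario.
performance <- function(est, sv, theta_true) {
  n_sim    <- length(est)
  bias     <- mean(est) - theta_true                    # Equation 1
  rel_bias <- (mean(sv) - var(est)) / var(est) * 100    # Equation 2
  ci_lower <- est - 1.96 * sqrt(sv)
  ci_upper <- est + 1.96 * sqrt(sv)
  coverage <- mean(ci_lower <= theta_true & theta_true <= ci_upper)
  mcse     <- sqrt(var(est) / n_sim)                    # Equation 3
  c(bias = bias, rel_bias = rel_bias, coverage = coverage, mcse = mcse)
}

performance(est = c(1, 2, 3), sv = rep(1, 3), theta_true = 2)
```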
5.2 Extended simulation results
In all cases, we found the Monte Carlo standard errors (MCSEs) to be low for all our performance metrics (\(\Delta sk\): MCSE range 0 to 0.01; \(\Delta ku\): MCSE range 0 to 0.6; \(\Delta Zr\): MCSE range 0 to 0.004).
5.2.0.1 Evaluating \(\Delta sk\), \(\Delta ku\), and \(\Delta Zr\) point estimators
Across all simulations, the \(\Delta sk\), \(\Delta ku\), and \(\Delta Zr\) point estimators exhibited small-sample bias at group sizes below roughly 20-30, except for \(\Delta ku\), which continued to show this bias up to n ~ 50-60, indicating that effect sizes involving kurtosis are more challenging to estimate (Fig. S 1). Regardless, small-sample biases were moderate, and there was rarely a consistent over- or under-estimation in point estimates across the scenarios evaluated (Fig. S 1).
As expected, both the analytical and jackknife estimates for \(\Delta sk\) and \(\Delta ku\) had minimal bias when data were normally distributed (skewness of both groups = 0: analytical bias = 0.001, SD = 0.009; jackknife bias = 0.002, SD = 0.011; kurtosis of both groups = 3: analytical bias = -0.005, SD = 0.025; jackknife bias = -0.006, SD = 0.031), but bias in both changed in complex ways that suggest challenges in estimating higher moments (Fig. S 2). This did not appear to be related to differences in group means or variances, as bias patterns were similar across these conditions (Fig. S 3), but rather to the sampling distributions being skewed (see "Coverage" below).
5.2.0.1.1 Fig. S1
Figure S 1— Bias in \(\Delta sk\), \(\Delta ku\) and \(\Delta Zr\) effect estimates across simulations where group sample sizes ranged over \(n \in \{10, 20, \dots, 100, 150, 500\}\). A total of 100 simulated scenarios were assessed for \(\Delta sk\) and \(\Delta ku\), whereas 64 simulated scenarios were assessed for \(\Delta Zr\). For each scenario we ran 2,500 simulations.
Importantly, bias-corrected jackknife estimates reduced the small-sample bias relative to the analytical bias-corrected moment estimators (Fig. S 1; mean square bias (MSB) for \(\Delta sk\): jackknife 1.109 vs. analytical 3.375; \(\Delta ku\): jackknife 473.737 vs. analytical 886.836; \(\Delta Zr\): jackknife 72.733 vs. analytical 535.089).
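The bias-corrected jackknife point estimate referred to above can be sketched with the standard jackknife bias-correction formula, \(\hat{\theta}_{bc} = n\hat{\theta} - (n - 1)\,\bar{\theta}_{(-i)}\). We assume this standard form is what underlies the jackknife bias-corrected estimators; the function names below are ours.

```r
# Generic jackknife bias correction for a one-sample estimator (a
# sketch; we assume this matches the form used in the simulations).
jack_bc <- function(x, estimator) {
  n     <- length(x)
  theta <- estimator(x)
  loo   <- vapply(seq_len(n), function(i) estimator(x[-i]), numeric(1))
  n * theta - (n - 1) * mean(loo)   # bias-corrected point estimate
}

# Example with the uncorrected sample skewness:
skew <- function(x) mean((x - mean(x))^3) / mean((x - mean(x))^2)^1.5
set.seed(42)
jack_bc(rexp(40), skew)
```

A useful sanity check on this form: applying it to an already unbiased estimator such as the mean leaves the estimate unchanged.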
5.2.0.1.2 Fig. S2
Figure S 2— Bias of analytical point estimators in relation to the absolute difference in skewness and kurtosis between groups: (A) skewness and (B) kurtosis. Point colour corresponds to sample size; each point is a single simulated scenario. The dotted line marks zero bias.
5.2.0.1.3 Fig. S3
Figure S 3— Bias in \(\Delta sk\) and \(\Delta ku\) across simulated scenarios was not related to differences in group means or variances. For each scenario we ran 2,500 simulations.
5.2.0.2 Evaluating sampling variance estimators for \(\Delta sk\), \(\Delta ku\), and \(\Delta Zr\)
In contrast to the point estimators, the effectiveness of the sampling variance estimators for \(\Delta sk\), \(\Delta ku\), and \(\Delta Zr\) varied. Analytical sampling variance formulas for \(\Delta sk\) and \(\Delta ku\) were consistently biased (Fig. S 4). Jackknife resampling, when combined with analytical point estimates (Fig. S 4D & Fig. S 4E), improved performance; under these conditions, the estimators performed well when n > 50. In contrast, the sampling variance estimator for \(\Delta Zr\) performed best when the analytical formulas were used for both the point estimator and its associated sampling variance (Fig. S 4I).
5.2.0.2.1 Fig. S4
Figure S 4— Relative bias of the sampling variance estimators for \(\Delta sk\), \(\Delta ku\) and \(\Delta Zr\) across simulations where group sample sizes ranged over \(n \in \{10, 20, \dots, 100, 150, 500\}\). A total of 100 simulated scenarios were assessed for \(\Delta sk\) and \(\Delta ku\), whereas 64 were assessed for \(\Delta Zr\). Relative bias was calculated using different combinations of point estimates and sampling variance estimates, as indicated in the panel titles. Notation: sk and ku are the skewness and kurtosis calculated using the original formulas; sk_sv and ku_sv are the sampling variance estimates from the original formulas; jack_skew_sv and jack_ku_sv are the jackknife sampling variance estimates for skewness and kurtosis; jack_skew_bc and jack_ku_bc are the bias-corrected jackknife point estimates. For each scenario we ran 2,500 simulations.
5.2.0.3 Coverage of 95% confidence intervals
Coverage was generally close to the nominal level (95%) for \(\Delta sk\) and \(\Delta Zr\), but slightly poorer when sample sizes were small (n < 20) (Fig. S 5). For \(\Delta Zr\), coverage was close to nominal across all sample sizes when using the analytical sampling variance estimator (Fig. S 5). However, coverage was poor for \(\Delta ku\) across many scenarios regardless of the estimator used (Fig. S 5). Counterintuitively, coverage for \(\Delta ku\) was often worse at larger sample sizes (Fig. S 5): point estimates became more accurate with large samples, but the sampling distribution became highly skewed, impacting coverage (Fig. S 6). At small sample sizes, \(\Delta ku\) was estimated poorly when the true \(\Delta ku\) was high, yielding less skewed sampling distributions and good coverage. In contrast, large sample sizes improved point estimation of \(\Delta ku\) when true differences existed, but the sampling distribution became highly skewed, leading to poor coverage (Fig. S 6).
5.2.0.3.1 Fig. S5
Figure S 5— Coverage of 95% confidence intervals for \(\Delta sk\), \(\Delta ku\) and \(\Delta Zr\) effect estimates across simulations where group sample sizes ranged over \(n \in \{10, 20, \dots, 100, 150, 500\}\). A total of 100 simulated scenarios were assessed for \(\Delta sk\) and \(\Delta ku\), whereas 64 were assessed for \(\Delta Zr\). For each scenario we ran 2,500 simulations.
5.2.0.3.2 Fig. S6
Figure S 6— Example sampling distributions for three different scenarios (\(\Delta ku\) = 0, 1 or 2.5) with n = 10 and n = 500 samples per group. For each scenario we ran 2,500 simulations.
5.3 Summary
In light of these simulation results, we suggest pairing the formula-based point estimators for skewness (Eq. 1 in main manuscript) and kurtosis (Eq. 3 in main manuscript) with jackknife standard errors for \(\Delta sk\) and \(\Delta ku\). For \(\Delta Zr\), the standard analytic variance is recommended (Eqs. 9-12 in main manuscript). This choice balances efficiency under normality with robustness to realistic deviations from it, and aligns with our broader guidance to avoid very small group sizes for these statistics. Given the challenges in estimating \(\Delta ku\) and the poor properties of its sampling variance, we recommend weighting meta-analytic models by sample size rather than by inverse sampling variance.
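As a minimal sketch of this recommendation (all object names hypothetical), sample-size weights replace the usual inverse-variance weights when pooling \(\Delta ku\) estimates; in practice this corresponds to passing the weights argument to metafor's rma():

```r
# Pooling hypothetical Delta ku estimates with sample-size weights
# instead of inverse-variance weights (a base-R sketch; in a real
# analysis, pass weights = ni to metafor::rma()).
yi <- c(0.8, -0.2, 1.5, 0.3)   # hypothetical Delta ku estimates
ni <- c(20, 45, 120, 60)       # total sample size behind each estimate
w  <- ni / sum(ni)             # normalised sample-size weights
pooled <- sum(w * yi)          # sample-size-weighted mean effect
pooled
```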
6 Equations and custom functions to calculate effect sizes
We then use the data from multiple phenotyping centres and mouse strains to calculate average effect sizes (\(\Delta sk\), \(\Delta ku\), and \(\Delta Zr\)).
map2_dfr(.x = c("fat_mass", "glucose"),
         .y = c("heart_weight", "total_cholesterol"),
         .f = process.cor_effects) %>%
  mutate(relationship = rep(c("fat mass and heart weight",
                              "glucose and total cholesterol"),
                            each = 7)) %>%
  relocate(relationship) %>%
  datatable(., extensions = "Buttons", rownames = FALSE)
## Warning: Using an external vector in selections was deprecated in tidyselect 1.1.0.
## ℹ Please use `all_of()` or `any_of()` instead.
##   # Was:
##   data %>% select(chosen_trait_2)
##   # Now:
##   data %>% select(all_of(chosen_trait_2))
## See <https://tidyselect.r-lib.org/reference/faq-external-vector.html>.
---title: "ONLINE SUPPLEMENT - Beyond sex differences in the mean: new approaches to meta-analyse differences in skewness, kurtosis, and correlation"author: "**Pietro Pollo, Szymon M. Drobniak, Hamed Haselimashhadi, Malgorzata Lagisz, Ayumi Mizuno, Laura A. B. Wilson, Daniel W. A. Noble, Shinichi Nakagawa**"date: "`r Sys.Date()`"format: html: toc: true toc-location: left toc-depth: 3 toc-title: "**Table of Contents**" output-file: "index.html" theme: simplex embed-resources: true code-fold: show code-tools: true number-sections: true bibliography: ref.bib fontsize: "12" max-width: "10" code-overflow: wrap page-layout: full fig-numbering: falsecrossref: fig-title: Figure S tbl-title: Table title-delim: "—" fig-prefix: "Fig. S" tbl-prefix: "Tab."editor_options: chunk_output_type: console---```{r setup}#| include: falseknitr::opts_chunk$set(collapse =TRUE,message =FALSE,warnings =FALSE,echo = TRUE#,#comment = "#>")```# UpdateWe will update this tutorial when necessary. Readers can access the latest version in our [GitHub repository](https://github.com/pietropollo/new_effect_size_statistics).If you have any questions, errors or bug reports, please contact Pietro Pollo (pietro_pollo\@hotmail.com) or Shinichi Nakagawa (snakagaw\@ualberta.ca).# IntroductionThis online material is a supplement to our paper "Beyond sex differences in mean: meta-analysis of differences in skewness, kurtosis, and correlation". You will see how to calculate the new effect size statistics we have proposed and how to use them in a meta-analytical model using the `metafor` package in `R`.# ContentIn this online material, we provide details on the simulations we ran to evaluate the effectiveness of our proposed effect size statistics (and their associated sampling error). 
We also show how to (1) calculate our newly proposed effect sizes ($\Delta sk$, $\Delta ku$, $\Delta Zr$) and (2) exemplify their use with data from the International Mouse Phenotyping Consortium.# Prerequisites## Loading packagesOur tutorial uses `R` statistical software and existing `R` packages, which you will first need to download and install.If the packages are archived in CRAN, use `install.packages()` to install them.For example, to install the `metafor` , you can execute `install.packages("metafor")` in the console (bottom left pane of `R Studio`).Version information of each package is listed at the end of this tutorial.```{r packages}if (!require("pacman")) {install.packages("pacman")}pacman::p_load(corrr, DT, ggdist, ggtext, here, janitor, latex2exp, magick, metafor, orchaRd, pander, patchwork, tidyverse)options(DT.options =list(rownames =FALSE,dom ="Blfrtip",scrollX =TRUE,pageLength =5,columnDefs =list(list(targets ='_all', className ='dt-center')),buttons =c('copy', 'csv', 'excel', 'pdf')))source("layout.R")```## Custom functionsWe also provide some additional helper functions to calculate effect sizes, process data, and visualise our results.The most straightforward way to use these custom functions is to run the code chunk below.Alternatively, paste the code into the console and hit `Enter` to have `R` ‘learn’ these custom functions.If you want to use these custom functions in your own data, you will need to change the variable names according to your own data (check out the `R` code and you will see what we mean).```{r}#| code-fold: true# calculate established effect sizes ----calc.effect <-function(data = raw_data, m) { # calculates other already established effect size statistics escalc(measure = m,m1i = data$mean_male,m2i = data$mean_female,sd1i = data$sd_male,sd2i = data$sd_female,n1i = data$n_male,n2i = data$n_female,var.names =c(paste0(m,"_est"),paste0(m,"_var")))}# processing functions ----process.ind_effects <-function(chosen_trait 
="fat_mass",measure ="KU_delta",output ="results") { ind_effects <- df_meta_analysed %>%filter(trait_name == chosen_trait, phenotyping_center %in%c("CCP-IMG","HMGU","JAX","MRC H","TCP")) %>%mutate(type ="individual") %>%select(phenotyping_center, strain_fig,n = n_total,est =all_of(paste0(measure, "_","est")),var =all_of(paste0(measure, "_","var")),lower =all_of(paste0(measure, "_","lower")),upper =all_of(paste0(measure, "_","upper"))) %>%rowid_to_column("effect_size_id") model <-rma.mv(data = ind_effects,yi = est,V = var,test ="t",random =list(~1|effect_size_id,~1|phenotyping_center, ~1|strain_fig))if (output =="results") { df_model <-data.frame(trait_name = chosen_trait,est = model$beta[1],var = model$se ^2,lower = model$ci.lb,upper = model$ci.ub,phenotyping_center ="Mean",strain_fig ="ES") ind_effects %>%bind_rows(df_model) %>%mutate(est_type = measure,centre_and_strain =factor(paste0(phenotyping_center,"\n", strain_fig))) %>%mutate(centre_and_strain =factor(centre_and_strain,levels =c("Mean\nES",rev(levels(centre_and_strain)[-6])))) } elseif (output =="heterogeneity") {tibble(trait_name_1 = chosen_trait,trait_name_2 =NA,est_type = measure) %>%add_column(as_tibble(t(i2_ml(model)))) %>%clean_names() }}process.cor_effects <-function(chosen_trait_1 ="fat_mass",chosen_trait_2 ="heart_weight",output ="results") { df_raw_cor <- df_raw %>%filter(trait_name %in%c(chosen_trait_1, chosen_trait_2), phenotyping_center %in%c("CCP-IMG","HMGU","JAX","MRC H","TCP")) %>%pivot_wider(id_cols =c(specimen_id, strain_fig, phenotyping_center, sex),names_from = trait_name) %>%clean_names() %>%drop_na() %>%group_by(strain_fig, phenotyping_center, sex) %>%group_modify(~correlate(.x)) %>%drop_na(all_of(chosen_trait_2)) %>%ungroup() %>%left_join(df_raw %>%filter(trait_name %in%c(chosen_trait_1, chosen_trait_2), phenotyping_center %in%c("CCP-IMG","HMGU","JAX","MRC H","TCP")) %>%pivot_wider(id_cols =c(specimen_id, strain_fig, phenotyping_center, sex),names_from = trait_name) %>%clean_names() 
%>%drop_na() %>%group_by(strain_fig, phenotyping_center, sex) %>%summarise(n =n())) %>%select(-all_of(chosen_trait_1)) %>%rename(r_est = chosen_trait_2) %>%pivot_wider(id_cols =c(strain_fig, phenotyping_center),names_from = sex,values_from =c(r_est, n)) df_effects_cor <- df_raw_cor %>%left_join(df_raw_cor %>%group_by(phenotyping_center, strain_fig) %>%reframe(orchaRd::cor_diff(cor1 =na.omit(r_est_male),cor2 =na.omit(r_est_female),n1 =na.omit(n_male),n2 =na.omit(n_female)))) %>%rename(delta_zr_est = zr_diff,delta_zr_var = var_zr_diff) %>%mutate(delta_zr_upper = delta_zr_est +qt(0.975, n_male + n_female -2) *sqrt(delta_zr_var),delta_zr_lower = delta_zr_est -qt(0.975, n_male + n_female -2) *sqrt(delta_zr_var)) %>%rowid_to_column("effect_size_id") model <-rma.mv(data = df_effects_cor,yi = delta_zr_est,V = delta_zr_var,test ="t",random =list(~1|effect_size_id,~1|phenotyping_center, ~1|strain_fig))if (output =="results") { df_model <-data.frame(delta_zr_est = model$beta[1],delta_zr_lower = model$ci.lb,delta_zr_upper = model$ci.ub,phenotyping_center ="Mean",strain_fig ="ES") df_effects_cor %>%bind_rows(df_model) %>%mutate(centre_and_strain =factor(paste0(phenotyping_center,"\n", strain_fig))) %>%mutate(centre_and_strain =factor(centre_and_strain,levels =c("Mean\nES",rev(levels(centre_and_strain)[-5])))) } elseif (output =="heterogeneity") {tibble(trait_name_1 = chosen_trait_1,trait_name_2 = chosen_trait_2,est_type ="delta_zr") %>%add_column(as_tibble(t(i2_ml(model)))) %>%clean_names() } }# visualisation functions ----caterpillar.custom <-function(chosen_trait ="fat_mass",measure ="KU_delta") { plot <-process.ind_effects(chosen_trait = chosen_trait,measure = measure) %>%ggplot(aes(y = centre_and_strain,x = est,xmax = upper,xmin = lower,shape = strain_fig,col = phenotyping_center)) +geom_pointrange() +geom_vline(xintercept =0,linetype ="dotted") +theme_classic() +theme(legend.position ="none",axis.text.y =element_blank(),axis.title.y =element_blank(),plot.tag.position 
=c(0.15, 0.98))if (measure =="ROM") { plot +labs(x ="lnRR") +scale_x_continuous(limits =c(-0.51, 0.51),breaks =c(-0.5, 0, 0.5)) +theme(axis.title.x = ggtext::element_markdown(face ="italic")) } elseif (measure =="VR") { plot +labs(x ="lnVR") +scale_x_continuous(limits =c(-1, 1),breaks =c(-1, 0, 1)) +theme(axis.title.x = ggtext::element_markdown(face ="italic")) } elseif (measure =="SK_delta") { plot +labs(x ="Δ*sk*") +scale_x_continuous(limits =c(-2.1, 2.1),breaks =c(-2,0, 2)) +theme(axis.title.x = ggtext::element_markdown()) } elseif (measure =="KU_delta") { plot +labs(x ="Δ*ku*") +scale_x_continuous(limits =c(-15, 15),breaks =c(-15, 0, 15)) +theme(axis.title.x = ggtext::element_markdown()) } }ridgeline.custom <-function(chosen_trait ="fat_mass") { processed_data <- df_raw %>%filter(trait_name == chosen_trait, phenotyping_center %in%c("CCP-IMG","HMGU","JAX","MRC H","TCP")) %>%add_row(phenotyping_center ="Mean",strain_fig ="ES") %>%mutate(centre_and_strain =factor(paste0(phenotyping_center,"\n", strain_fig))) %>%mutate(centre_and_strain =factor(centre_and_strain,levels =c("Mean\nES",rev(levels(centre_and_strain)[-5]))),value_s =scale(value)[, 1]) sample_sizes <- processed_data %>%count(centre_and_strain, sex) %>%slice(-1) %>%add_row(centre_and_strain =factor("Mean\nES"), processed_data %>%filter(!is.na(sex)) %>%count(sex)) %>%pivot_wider(id_cols = centre_and_strain,names_from = sex,values_from = n) %>%mutate(label =paste0("Nf = ", female,", Nm = ", male)) %>%arrange(centre_and_strain) processed_data %>%ggplot(aes(x = value_s,y = centre_and_strain,fill = sex,linetype = sex)) +stat_slab(scale =0.7, alpha =0.4,linewidth =0.6,col ="black") +scale_fill_manual(values =c("white","black")) +scale_linetype_manual(values =c("solid","dashed")) +labs(x =paste0(str_to_sentence(str_replace_all(chosen_trait,"_"," ")),"\n(scaled)"),y ="Phenotyping centre and mice strain") +annotate(geom ="text",x =mean(range(processed_data$value_s,na.rm = T)),y 
=as.numeric(sample_sizes$centre_and_strain) -0.15,label = sample_sizes$label,size =2) +theme_classic() +theme(legend.position ="none",axis.title.x =element_text(size =12, margin =margin(t =0.2,unit ="cm")),axis.title.y =element_text(size =12, margin =margin(r =0.2,unit ="cm")),axis.text.x =element_text(size =10),axis.text.y =element_text(size =10),plot.tag.position =c(0.53, 0.98))}cor.caterpillar.custom <-function(chosen_trait_1 ="fat_mass",chosen_trait_2 ="heart_weight") {process.cor_effects(chosen_trait_1 = chosen_trait_1,chosen_trait_2 = chosen_trait_2) %>%ggplot(aes(y = centre_and_strain,x = delta_zr_est,xmax = delta_zr_upper,xmin = delta_zr_lower,shape = strain_fig,col = phenotyping_center)) +geom_pointrange() +geom_vline(xintercept =0,linetype ="dotted") +labs(y ="Phenotyping centre and mice strain",x ="Δ*Zr*", shape ="Strain") +scale_x_continuous(limits =c(-1, 1),breaks =c(-1, 0, 1)) +theme_classic() +theme(legend.position ="none",axis.title.x = ggtext::element_markdown(size =12, margin =margin(t =0.2,unit ="cm")),axis.title.y =element_text(size =12,margin =margin(r =-0.1,unit ="cm")),axis.text.x =element_text(size =10),axis.text.y =element_text(size =10),plot.tag.position =c(0.3, 0.99)) }cor.plot.custom <-function(chosen_trait_1 ="fat_mass",chosen_trait_2 ="heart_weight",chosen_lims =c(-3, 5)) { df_cor <- df_raw %>%filter(trait_name %in%c(chosen_trait_1, chosen_trait_2), phenotyping_center %in%c("CCP-IMG","HMGU","JAX","MRC H","TCP")) %>%pivot_wider(id_cols =c(specimen_id, strain_fig, phenotyping_center, sex),names_from = trait_name) %>%clean_names() %>%drop_na() %>%mutate(centre_and_strain =factor(paste0(phenotyping_center, strain_fig))) %>%mutate(centre_and_strain =factor(centre_and_strain,levels =rev(levels(centre_and_strain))),trait_1_s =scale(get(chosen_trait_1))[,1],trait_2_s =scale(get(chosen_trait_2))[,1]) sample_sizes <- df_cor %>%count(centre_and_strain, sex) %>%pivot_wider(id_cols = centre_and_strain,names_from = sex,values_from = n) 
%>%mutate(label =paste0("Nf = ", female,", Nm = ", male)) %>%arrange(desc(centre_and_strain)) plot_list <-list()for (i in1:length(levels(df_cor$centre_and_strain))) { level_i <-sort(levels(df_cor$centre_and_strain))[i] plot <- df_cor %>%filter(centre_and_strain == level_i) %>%ggplot(aes(x = trait_1_s,y = trait_2_s,shape = sex,linetype = sex)) +geom_point(alpha =0.008) +geom_abline(intercept =0,slope =1,linewidth =0.5,linetype ="dotted") +geom_smooth(method ="lm",se = F,col ="black") +scale_shape_manual(values =c(3, 4)) +scale_linetype_manual(values =c("solid","dashed")) +scale_x_continuous(limits = chosen_lims) +scale_y_continuous(limits = chosen_lims) +labs(x =paste0(str_to_sentence(str_replace_all(chosen_trait_1, "_", " ")),"\n(scaled)"),y =paste0(str_to_sentence(str_replace_all(chosen_trait_2, "_", " "))," (scaled)")) +annotate(geom ="text",x =mean(chosen_lims),y =4.5,label = sample_sizes$label[i],size =2) +theme_classic() +theme(legend.position ="none",plot.tag.position =c(0.05, 0.91),axis.title.x =element_text(size =12, margin =margin(t =0.2,unit ="cm")),axis.title.y =element_text(size =12, margin =margin(r =0.2,unit ="cm")),axis.text.x =element_text(size =10),axis.text.y =element_text(size =10))if (i !=6) { plot <- plot +theme(axis.title.x =element_blank(),axis.text.x =element_blank(),axis.line.x =element_blank(),axis.ticks.x =element_blank()) } plot_list[[i]] <- plot }return(plot_list) }```# Simulations## Extended simulation detailsWe conducted Monte-Carlo simulations to evaluate bias and variance estimation for our new effect sizes $\Delta sk$, $\Delta ku$, and $\Delta Zr$. For $\Delta sk$ and $\Delta ku$ we simulated independent samples for two groups from Pearson distributions with known moments using the *rpearson* function from the PearsonDS R package [vers. 1.3.2, @PearsonDS]. 
We conducted two simulations: 1) first by changing skewness between groups that involved moderate departures from normality (group-specific skewness, $sk \in \{-1, -0.5, 0, 0.5, 1\}$ with kurtosis fixed at 3) and 2) by holding skewness constant ($sk$ = 0) while manipulating kurtosis, $ku \in \{2.5, 3, 4, 5, 6\}$. In all cases, we simulated scenarios where: (i) the variance between each group was the same ($\sigma^2_{2}$ = $\sigma^2_{1}$ = 1) or different ($2\sigma^2_{2}$ versus $\sigma^2_{1}$); (ii) the mean between the two groups was the same ($\mu_{2}$ = $\mu_{1}$ = 0) or different ($\mu_{2}$ = 5, $\mu_{1}$ = 0). For simplicity we assumed equal sample sizes between groups with sample size varying from $n \in \{10, 20, \dots, 100, 150, 500\}$. We created all unique combinations of the above scenarios resulting in 1200 independent scenarios (when considering each of the 100 scenarios at each sample size, see examples in @sec-skewness and @sec-kurtosis). We estimated $\Delta sk$ and $\Delta ku$ for each scenario using formulas for within-group sample skewness with small-sample correction (Eq. 1 in main manuscript) and excess kurtosis with small-sample correction (Eq. 3 in main manuscript) to estimate point estimates. To estimate associated sampling variance for $\Delta sk$ and $\Delta ku$ we used the analytical variance estimators derived here and an associated re-sampling (jackknife) approach to compute group sampling variances separately followed by pooling. Importantly, our simulations assume no correlation between groups. For $\Delta Zr$ simulations, we simulated two groups each containing two variables with known correlations within each group. For $\Delta Zr$ we drew bivariate normal data with target within-group correlations $r \in \{-0.8, -0.4, -0.2, 0, 0.2, 0.4, 0.6, 0.8\}$ using the *mvnorm* function in the MASS package of R [vers. 7.3.61, @MASS]. Marginals were standard normal and group sizes varied from $n \in \{10, 20, \dots, 100, 150, 500\}$. 
Again, we created all unique combinations of scenarios resulting in 768 unique scenarios. We estimated $\Delta Zr$ using Fisher’s Z transformation $Zr$ and calculating $\Delta Zr$ as the difference of $Zr$ across groups (Eqs. 9–11 in main manuscript). Sampling variance for $\Delta Zr$ was approximated used the standard analytic variance $\frac{1}{n-3}$ per group (summed; Eq. 10 in main manuscript) and a jackknife approach. Again, we assumed no correlation between our groups. Across all simulations we conducted 2,500 replicates of each scenario. Performance metrics were (a) bias of the point estimator (mean estimate minus truth) (@eq-bias), (b) relative bias of the sampling-variance estimator (@eq-relbias), (c) coverage (95%) and (d) Monte-Carlo standard errors (MCSEs) (@eq-mcse). $$\text{Bias}(\hat{\theta}_{mu}) = \frac{\sum_{i=1}^{n_{\text{sim}}} \hat{\theta}_i}{n_{\text{sim}}} - \theta$$ {#eq-bias}where $\theta_{mu}$ is the true overall mean effect (i.e., the true $\Delta sk$, $\Delta ku$, and $\Delta Zr$ between groups), and $\hat{\theta}_i$ is the effect estimate from simulation $i$. We calcuated coverage as the proportion of 95% confidence intervals (CIs) that contained the true effect size across all simulations for each scenario. To understand bias in the sampling variance, we computed the relative bias for the effect measure as:$$\text{Relative Bias}(\hat{\theta}_{SV}) = \left( \frac{\mathbb{E}[\hat{SV}] - \hat{\theta}^2}{\hat{\theta}^2}\right) \cdot 100\% $$ {#eq-relbias}where $\hat{\theta}^2$ is the estimated sampling variance for the effect measure and $\mathbb{E}[\hat{SV}]$ is the expected value of the sampling variance. Relative bias can be calculated using different combinations of estimates. For example, we can use $\hat{\theta}^2$ as the demoninator from a point estimate taken from the small sample correction for $\Delta sk$, $\Delta ku$, or $\Delta Zr$ or from the jackknife estimate. 
Alternatively, we can use the jackknife estimate for $\mathbb{E}[\hat{SV}]$ across each simulation. We computed each of the four combinations for our performance measures. For each performance measure, we also computed the Monte Carlo standard error (MCSE) of the estimated bias as

$$\text{MCSE} = \sqrt{\frac{S^2_{\hat{\theta}}}{n_{\text{sim}}}}$$ {#eq-mcse}

where $S^2_{\hat{\theta}}$ is the sample variance of the estimated effects across simulations and $n_{\text{sim}}$ is the number of simulations.

## Extended simulation results

```{r}
#| label: sim_results
#| echo: false
#| message: false

# Load the simulation results
sk  <- readRDS("./output/result_skewness.rds")
ku  <- readRDS("./output/result_kurt.rds")
cor <- readRDS("./output/result_cor.rds")

# Summarise the means of all MCSE columns
sk_mcse  <- sk  %>% summarise(across(starts_with("mcse"), function(x) mean(x, na.rm = TRUE)))
ku_mcse  <- ku  %>% summarise(across(starts_with("mcse"), function(x) mean(x, na.rm = TRUE)))
cor_mcse <- cor %>% summarise(across(starts_with("mcse"), function(x) mean(x, na.rm = TRUE)))

# Find the least-biased estimator via the Mean Squared Bias (MSB)
result_kurt_least_bias <- ku %>%
  summarise(across(starts_with("bias_"), ~ sum(.x^2, na.rm = TRUE), .names = "ms_bias_{col}")) %>%
  t() %>%
  data.frame() %>%
  arrange()
colnames(result_kurt_least_bias) <- "MSBE"

result_skew_least_bias <- sk %>%
  summarise(across(starts_with("bias_"), ~ sum(.x^2, na.rm = TRUE), .names = "ms_bias_{col}")) %>%
  t() %>%
  data.frame() %>%
  arrange()
colnames(result_skew_least_bias) <- "MSBE"

result_cor_least_bias <- cor %>%
  summarise(across(starts_with("bias_"), ~ sum(.x^2, na.rm = TRUE), .names = "ms_bias_{col}")) %>%
  t() %>%
  data.frame() %>%
  arrange()
colnames(result_cor_least_bias) <- "MSBE"

# Bias for skewness when both groups are normally distributed (skewness = 0)
sk_normal <- sk %>%
  filter(skewness_g1 == 0 & skewness_g2 == 0) %>%
  summarise(mean_bias_sk = mean(bias_sk),
            sd_bias_sk = sd(bias_sk),
            mean_bias_jack = mean(bias_sk_jack_bc),
            sd_bias_jack = sd(bias_sk_jack_bc),
            n = n())

# Bias for kurtosis when both groups are normally distributed (kurtosis = 3)
ku_normal <- ku %>%
  filter(kurtosis_g1 == 3 & kurtosis_g2 == 3) %>%
  summarise(mean_bias_ku = mean(bias_ku),
            sd_bias_ku = sd(bias_ku),
            mean_bias_jack = mean(bias_ku_jack_bc),
            sd_bias_jack = sd(bias_ku_jack_bc),
            n = n())

# New columns describing the average deviation of both distributions from normality
sk <- sk %>%
  mutate(dev_skew = abs((skewness_g1 - 0) + (skewness_g2 - 0)))
ku <- ku %>%
  mutate(dev_kurt = abs((kurtosis_g1 - 3) + (kurtosis_g2 - 3)),
         var_diff = as.factor(abs(variance_g1 - variance_g2)),
         mean_diff = as.factor(abs(mean_g1 - mean_g2)))

# Plots
sk_why <- ggplot(sk, aes(y = bias_sk, x = dev_skew, fill = n, colour = n)) +
  geom_point(size = 2) +
  geom_hline(yintercept = 0, linetype = "dashed", color = "black") +
  labs(y = TeX("$Bias_{\\Delta \\textit{sk}}$ (small-sample estimator)"),
       x = TeX("Absolute deviation from normality $ \\left( \\left| ( \\textit{sk}_{g1} - 0 ) + (\\textit{sk}_{g2} - 0 ) \\right| \\right)$"),
       fill = "Sample size", colour = "Sample size") +
  scale_fill_viridis_c() +
  scale_colour_viridis_c() +
  theme_classic()

ku_why <- ggplot(ku, aes(y = bias_ku, x = dev_kurt, fill = n, colour = n)) +
  geom_point(size = 2) +
  geom_hline(yintercept = 0, linetype = "dashed", color = "black") +
  labs(y = TeX("$Bias_{\\Delta \\textit{ku}}$ (small-sample estimator)"),
       x = TeX("Absolute deviation from normality $ \\left( \\left| (\\textit{ku}_{g1} - 3 ) + (\\textit{ku}_{g2} - 3 ) \\right| \\right)$"),
       fill = "Sample size", colour = "Sample size") +
  theme_classic() +
  scale_fill_viridis_c() +
  scale_colour_viridis_c()

plot_why <- sk_why + ku_why +
  plot_layout(nrow = 2) +
  plot_annotation(tag_levels = "A")

# ggsave("./output/figs/plot_why.png",
#        plot = plot_why,
#        bg = "white",
#        dpi = 600,
#        width = 7,
#        height = 7,
#        units = "in")
```

In all cases, we found the Monte Carlo standard errors (MCSEs) to be low for all our performance metrics ($\Delta sk$: range of MCSEs, `r round(min(sk_mcse), digits = 3)` to `r round(max(sk_mcse), digits = 3)`; $\Delta ku$: range of MCSEs, `r round(min(ku_mcse), digits = 3)` to `r round(max(ku_mcse), digits = 3)`; $\Delta Zr$: range of MCSEs, `r round(min(cor_mcse), digits = 3)` to `r round(max(cor_mcse), digits = 3)`).
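Because the sections below compare analytical estimators against delete-one jackknife estimators, a minimal sketch of the jackknife may help fix ideas. This is an illustrative pooled variant under our own simplifying assumptions, not the implementation in `orchaRd::moment_effects`; `stat` is any hypothetical function returning a two-group effect size (e.g., $\Delta sk$) from raw data.

```r
# Delete-one jackknife for a two-group effect size (illustrative sketch).
# `stat(x1, x2)` returns the effect estimate (e.g., Delta-sk) from raw data.
jackknife_effect <- function(x1, x2, stat) {
  theta_hat <- stat(x1, x2)
  # leave-one-out replicates, dropping one observation from either group
  reps <- c(vapply(seq_along(x1), function(i) stat(x1[-i], x2), numeric(1)),
            vapply(seq_along(x2), function(i) stat(x1, x2[-i]), numeric(1)))
  n <- length(reps)
  theta_bar <- mean(reps)
  list(est_bc = n * theta_hat - (n - 1) * theta_bar,  # bias-corrected estimate
       sv = (n - 1) / n * sum((reps - theta_bar)^2))  # jackknife sampling variance
}

# e.g., jackknife_effect(rnorm(30), rnorm(30), function(a, b) mean(a) - mean(b))
```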
#### Evaluating $\Delta sk$, $\Delta ku$, and $\Delta Zr$ point estimators

Across all simulations, the $\Delta sk$, $\Delta ku$, and $\Delta Zr$ point estimators exhibited small-sample bias when group sample sizes were below roughly 20–30, except for $\Delta ku$, which continued to show this bias up to n ~ 50–60, indicating that effect sizes involving kurtosis are more challenging to estimate (@fig-bias). Regardless, small-sample biases were moderate, and there was rarely a consistent over- or under-estimation in point estimates across the scenarios evaluated (@fig-bias). As expected, both the analytical and jackknife estimates for $\Delta sk$ and $\Delta ku$ had minimal bias when distributions were normally distributed (skewness of both groups = 0: analytical bias = `r round(sk_normal$mean_bias_sk, digits = 3)` with SD = `r round(sk_normal$sd_bias_sk, digits = 3)` & jackknife = `r round(sk_normal$mean_bias_jack, digits = 3)` with SD = `r round(sk_normal$sd_bias_jack, digits = 3)`; kurtosis of both groups = 3: analytical bias = `r round(ku_normal$mean_bias_ku, digits = 3)` with SD = `r round(ku_normal$sd_bias_ku, digits = 3)` & jackknife = `r round(ku_normal$mean_bias_jack, digits = 3)` with SD = `r round(ku_normal$sd_bias_jack, digits = 3)`), but bias in both changed in complex ways that suggest challenges in estimating higher moments (@fig-plot_why). This did not appear to be related to differences in group means or variances, as bias patterns were similar across these conditions (@fig-mean_var), but rather to the sampling distributions being skewed (see "Coverage" below).

##### Fig. S1

```{r}
#| label: fig-bias
#| echo: false
#| fig.cap: Bias in $\Delta sk$, $\Delta ku$ and $\Delta Zr$ effect estimates across simulations where group sample sizes ranged between $n \in \{10, 20, \dots, 100, 150, 500\}$. A total of 100 simulated scenarios were assessed for $\Delta sk$ and $\Delta ku$, whereas 64 simulated scenarios were assessed for $\Delta Zr$. For each scenario we ran 2,500 simulations.

fig_bias <- image_read("./output/figs/bias.png")
fig_bias
```

Importantly, bias-corrected jackknife estimates reduced the small-sample bias relative to the analytical bias-corrected moment estimators (@fig-bias; $\Delta sk$ mean squared bias (MSB): jackknife: `r round(result_skew_least_bias$MSBE[rownames(result_skew_least_bias) == "ms_bias_bias_sk_jack_bc"], digits = 3)`, analytical: `r round(result_skew_least_bias$MSBE[rownames(result_skew_least_bias) == "ms_bias_bias_sk"], digits = 3)`; $\Delta ku$ MSB: jackknife: `r round(result_kurt_least_bias$MSBE[rownames(result_kurt_least_bias) == "ms_bias_bias_ku_jack_bc"], digits = 3)`, analytical: `r round(result_kurt_least_bias$MSBE[rownames(result_kurt_least_bias) == "ms_bias_bias_ku"], digits = 3)`; $\Delta Zr$ MSB: jackknife: `r round(result_cor_least_bias$MSBE[rownames(result_cor_least_bias) == "ms_bias_bias_jack_d_cor"], digits = 3)`, analytical: `r round(result_cor_least_bias$MSBE[rownames(result_cor_least_bias) == "ms_bias_bias_d_cor"], digits = 3)`).

##### Fig. S2

```{r}
#| label: fig-plot_why
#| echo: false
#| fig.cap: Bias of analytical point estimators in relation to the absolute difference in skewness and kurtosis between groups. A) Skewness and B) kurtosis. Point colours correspond to sample size and each point is a single simulated scenario. The dashed line is the zero-bias line.

plot_why <- image_read("./output/figs/plot_why.png")
plot_why
```

##### Fig. S3

```{r}
#| label: fig-mean_var
#| echo: false
#| fig.cap: Bias for $\Delta sk$ and $\Delta ku$ across simulated scenarios was not related to differences in group means or variances. For each scenario we ran 2,500 simulations.

mean_var <- image_read("./output/figs/combined_mean_var.png")
mean_var
```

#### Evaluating sampling variance estimators for $\Delta sk$, $\Delta ku$, and $\Delta Zr$

In contrast to point estimators, the effectiveness of sampling variance estimators for $\Delta sk$, $\Delta ku$, and $\Delta Zr$ varied. Analytical sampling variance formulas for $\Delta sk$ and $\Delta ku$ were consistently biased (@fig-relbias). Jackknife resampling, when combined with analytical point estimates ([@fig-relbias]D & [@fig-relbias]E), improved performance; under these conditions, estimators performed well when n > 50. In contrast, the performance of sampling variance estimators for $\Delta Zr$ was best when using the analytical formulas for both the point estimator and its associated sampling variance ([@fig-relbias]I).

##### Fig. S4

```{r}
#| label: fig-relbias
#| echo: false
#| fig.cap: Relative bias in $\Delta sk$, $\Delta ku$ and $\Delta Zr$ effect estimates across simulations where group sample sizes ranged between $n \in \{10, 20, \dots, 100, 150, 500\}$. A total of 100 simulated scenarios were assessed for $\Delta sk$ and $\Delta ku$, whereas 64 simulated scenarios were assessed for $\Delta Zr$. Note that relative bias was calculated using different combinations of point estimates and sampling variance estimates, as indicated in the panel titles. Notation is as follows. ku and sk are the kurtosis and skewness calculated using the original formulas; sk_sv and ku_sv are the sampling variance estimates using the original formulas; jack_skew_sv and jack_ku_sv are the jackknife sampling variance estimates for skewness and kurtosis; jack_skew_bc and jack_ku_bc are the bias-corrected point estimates from the jackknife. For each scenario we ran 2,500 simulations.

fig_relbias <- image_read("./output/figs/relativebias.png")
fig_relbias
```

#### Coverage of 95% confidence intervals

Coverage was generally close to nominal levels (95%) for $\Delta sk$ and $\Delta Zr$, but slightly poorer when sample sizes were small (n < 20) (@fig-coverage). For $\Delta Zr$, coverage was close to nominal levels across all sample sizes when using the analytical sampling variance estimator (@fig-coverage). However, coverage was poor for $\Delta ku$ across many scenarios regardless of the estimator used (@fig-coverage). Unintuitively, coverage for $\Delta ku$ was often worse at larger sample sizes (@fig-coverage). At small sample sizes, $\Delta ku$ was estimated poorly when the true $\Delta ku$ was high, leading to non-skewed sampling distributions with good coverage. In contrast, large sample sizes improved point estimation of $\Delta ku$ when true differences existed, but the sampling distribution became highly skewed, leading to poor coverage (@fig-sampledist).

##### Fig. S5

```{r}
#| label: fig-coverage
#| echo: false
#| fig.cap: Coverage of 95% confidence intervals for $\Delta sk$, $\Delta ku$ and $\Delta Zr$ effect estimates across simulations where group sample sizes ranged between $n \in \{10, 20, \dots, 100, 150, 500\}$. A total of 100 simulated scenarios were assessed for $\Delta sk$ and $\Delta ku$, whereas 64 simulated scenarios were assessed for $\Delta Zr$. For each scenario we ran 2,500 simulations.

fig_coverage <- image_read("./output/figs/coverage.png")
fig_coverage
```

##### Fig. S6

```{r}
#| label: fig-sampledist
#| echo: false
#| fig.cap: Example sampling distributions of three different scenarios ($\Delta ku$ = 0, 1 or 2.5) for n = 10 and n = 500 samples for each group. For each scenario we ran 2,500 simulations.
fig_sampledist <- image_read("./output/figs/FigS_plos_samplingdist.png")
fig_sampledist
```

## Summary

In light of these simulation results, we suggest pairing the formula-based point estimators for skewness (Eq. 1 in the main manuscript) and kurtosis (Eq. 3 in the main manuscript) with jackknife standard errors for $\Delta sk$ and $\Delta ku$. For $\Delta Zr$, the standard analytic variance is recommended (Eqs. 9–12 in the main manuscript). This choice balances efficiency under normality with robustness to realistic deviations from it, and aligns with our broader guidance to avoid very small group sizes for these statistics. Given the challenges in estimating $\Delta ku$, and the poor properties of its sampling variance, we recommend meta-analytic models weighted by sample size rather than by inverse sampling variance.

# Equations and custom functions to calculate effect sizes

## Skewness

Following Joanes & Gill (1998).

$$sk = \frac{\frac{1}{n} \sum_{i = 1}^{n}(x_{i} - \bar{x}) ^ 3}{[\frac{1}{n} \sum_{i = 1}^{n}(x_{i} - \bar{x}) ^ 2] ^ \frac{3}{2}}\frac{\sqrt{n (n - 1)}}{n - 2}$$

$$s^2_{sk} = \frac{6n(n - 1)}{(n - 2)(n + 1)(n + 3)}$$

$$\Delta sk = sk_{1} - sk_{2}$$

$$s^2_{\Delta sk} = s^2_{sk_1} + s^2_{sk_2} - 2 \rho_{sk} s_{sk_1} s_{sk_2}$$

## Kurtosis

Following Joanes & Gill (1998).

$$ku = \frac{n (n + 1) (n - 1)}{(n - 2)(n - 3)} \frac{\sum_{i = 1}^{n}(x_{i} - \bar{x}) ^ 4}{[\sum_{i = 1}^{n}(x_{i} - \bar{x}) ^ 2]^ 2} - \frac{3(n - 1) ^ 2}{(n - 2)(n - 3)}$$

$$s^2_{ku} = \frac{24 n (n - 1) ^ 2}{(n - 3)(n - 2)(n + 3)(n + 5)}$$

$$\Delta ku = ku_{1} - ku_{2}$$

$$s^2_{\Delta ku} = s^2_{ku_1} + s^2_{ku_2} - 2 \rho_{ku} s_{ku_1} s_{ku_2}$$

## Zr

Following Hedges & Olkin (1985).
$$Zr = \frac{\ln(\frac{1 + r}{1 - r})}{2}$$

$$s^2_{Zr} = \frac{1}{n - 3}$$

$$\Delta Zr = Zr_{1} - Zr_{2}$$

$$s^2_{\Delta Zr} = s^2_{Zr_1} + s^2_{Zr_2} - 2 \rho_{Zr} s_{Zr_1} s_{Zr_2}$$

# Data loading and preparation

We use data from the International Mouse Phenotyping Consortium (IMPC, version 18.0; Dickinson et al., 2016; http://www.mousephenotype.org/).

```{r data}
# raw data ----
df_raw <- read_csv("mice_data_sample.csv") %>%
  # small adjustments to make plots more readable:
  mutate(phenotyping_center = ifelse(phenotyping_center == "MRC Harwell",
                                     "MRC H", phenotyping_center),
         strain_fig = case_when(strain_accession_id == "MGI:2159965" ~ "N",
                                strain_accession_id == "MGI:2683688" ~ "NCrl",
                                strain_accession_id == "MGI:2164831" ~ "NTac",
                                strain_accession_id == "MGI:3056279" ~ "NJ",
                                strain_accession_id == "MGI:2160139" ~ "NJcl"))

df_raw_wide <- df_raw %>%
  select(-strain_accession_id) %>%
  pivot_wider(id_cols = c(specimen_id, trait_name, phenotyping_center, strain_fig),
              names_from = sex,
              values_from = value)

df_meta_analysed <- # takes 30 s to a minute to run
  df_raw %>%
  group_by(sex, trait_name, phenotyping_center, strain_fig) %>%
  summarize(mean = mean(value, na.rm = T),
            sd = sd(value, na.rm = T),
            n = n()) %>%
  pivot_wider(id_cols = c(trait_name, phenotyping_center, strain_fig),
              names_from = sex,
              values_from = c(mean:n)) %>%
  bind_cols(calc.effect(., m = "ROM")) %>% # lnRR
  bind_cols(calc.effect(., m = "CVR")) %>% # lnCVR
  bind_cols(calc.effect(., m = "VR")) %>%  # lnVR
  left_join(df_raw_wide %>%
              group_by(trait_name, phenotyping_center, strain_fig) %>%
              reframe(orchaRd::moment_effects(x1 = na.omit(male),
                                              x2 = na.omit(female),
                                              type = "skew"))) %>%
  left_join(df_raw_wide %>%
              group_by(trait_name, phenotyping_center, strain_fig) %>%
              reframe(orchaRd::moment_effects(x1 = na.omit(male),
                                              x2 = na.omit(female),
                                              type = "kurt"))) %>%
  rename(SK_delta_est = d_skew,
         SK_delta_var = d_skew_v,
         KU_delta_est = d_kurt,
         KU_delta_var = d_kurt_v) %>%
  mutate(n_total = n_female + n_male,
         prop_females = n_female / (n_female + n_male)) %>%
  select(trait_name, phenotyping_center, strain_fig,
         n_total, prop_females,
         ROM_est, ROM_var, CVR_est, CVR_var, VR_est, VR_var,
         SK_delta_est, SK_delta_var, KU_delta_est, KU_delta_var) %>%
  mutate(ROM_upper = ROM_est + qt(0.975, n_total - 1) * sqrt(ROM_var),
         ROM_lower = ROM_est - qt(0.975, n_total - 1) * sqrt(ROM_var),
         CVR_upper = CVR_est + qt(0.975, n_total - 1) * sqrt(CVR_var),
         CVR_lower = CVR_est - qt(0.975, n_total - 1) * sqrt(CVR_var),
         VR_upper = VR_est + qt(0.975, n_total - 1) * sqrt(VR_var),
         VR_lower = VR_est - qt(0.975, n_total - 1) * sqrt(VR_var),
         SK_delta_upper = SK_delta_est + qt(0.975, n_total - 1) * sqrt(SK_delta_var),
         SK_delta_lower = SK_delta_est - qt(0.975, n_total - 1) * sqrt(SK_delta_var),
         KU_delta_upper = KU_delta_est + qt(0.975, n_total - 1) * sqrt(KU_delta_var),
         KU_delta_lower = KU_delta_est - qt(0.975, n_total - 1) * sqrt(KU_delta_var))
```

# Meta-analytical models

We then use the data from multiple phenotyping centres and mouse strains to calculate average effect sizes ($\Delta sk$, $\Delta ku$, and $\Delta Zr$).

## Single variable effect sizes

```{r}
map2_dfr(.x = rep(c("fat_mass", "heart_weight", "glucose", "total_cholesterol"), each = 4),
         .y = rep(c("ROM", "VR", "SK_delta", "KU_delta"), 4),
         .f = process.ind_effects) %>%
  mutate(est_type = case_when(est_type == "ROM" ~ "lnRR",
                              est_type == "VR" ~ "lnVR",
                              est_type == "SK_delta" ~ "delta_sk",
                              est_type == "KU_delta" ~ "delta_ku")) %>%
  datatable(., extensions = "Buttons", rownames = FALSE)
```

## Correlational effect sizes

```{r}
map2_dfr(.x = c("fat_mass", "glucose"),
         .y = c("heart_weight", "total_cholesterol"),
         .f = process.cor_effects) %>%
  mutate(relationship = rep(c("fat mass and heart weight",
                              "glucose and total cholesterol"), each = 7)) %>%
  relocate(relationship) %>%
  datatable(., extensions = "Buttons", rownames = FALSE)
```

## Heterogeneity

```{r}
map2_dfr(.x = rep(c("fat_mass", "heart_weight", "glucose", "total_cholesterol"), each = 4),
         .y = rep(c("ROM", "VR", "SK_delta", "KU_delta"), 4),
         .f = ~ process.ind_effects(chosen_trait = .x,
                                    measure = .y,
                                    output = "heterogeneity")) %>%
  mutate(est_type = case_when(est_type == "ROM" ~ "lnRR",
                              est_type == "VR" ~ "lnVR",
                              est_type
                              == "SK_delta" ~ "delta_sk",
                              est_type == "KU_delta" ~ "delta_ku")) %>%
  bind_rows(map2_dfr(.x = c("fat_mass", "glucose"),
                     .y = c("heart_weight", "total_cholesterol"),
                     .f = ~ process.cor_effects(chosen_trait_1 = .x,
                                                chosen_trait_2 = .y,
                                                output = "heterogeneity"))) %>%
  datatable(., extensions = "Buttons", rownames = FALSE)
```

# Visualisations

## Meta-analytic

### Figure 3

```{r, echo=FALSE, message=FALSE, warning=FALSE, fig.width = 12, fig.height = 12}
list_figure_3 <- list()
list_figure_3[[1]] <- ridgeline.custom("fat_mass")
list_figure_3[2:5] <- map2(.x = rep("fat_mass", 4),
                           .y = c("ROM", "VR", "SK_delta", "KU_delta"),
                           .f = caterpillar.custom)
list_figure_3[[6]] <- ridgeline.custom("heart_weight")
list_figure_3[7:10] <- map2(.x = rep("heart_weight", 4),
                            .y = c("ROM", "VR", "SK_delta", "KU_delta"),
                            .f = caterpillar.custom)
(figure_3 <- list_figure_3 %>%
   wrap_plots(ncol = 5) +
   plot_annotation(tag_levels = "A"))

# ggsave("./output/figs/figure_3.jpg",
#        figure_3,
#        bg = "white",
#        dpi = 600,
#        width = 7,
#        height = 7,
#        units = "in")
```

### Figure 4

```{r, echo=FALSE, message=FALSE, warning=FALSE, fig.width = 12, fig.height = 12}
list_figure_4 <- list()
list_figure_4[1:6] <- cor.plot.custom(chosen_trait_1 = "fat_mass",
                                      chosen_trait_2 = "heart_weight")
list_figure_4[[7]] <- cor.caterpillar.custom(chosen_trait_1 = "fat_mass",
                                             chosen_trait_2 = "heart_weight")
list_figure_4[8:13] <- cor.plot.custom(chosen_trait_1 = "glucose",
                                       chosen_trait_2 = "total_cholesterol")
list_figure_4[[14]] <- cor.caterpillar.custom(chosen_trait_1 = "glucose",
                                              chosen_trait_2 = "total_cholesterol")
(figure_4 <- list_figure_4 %>%
   wrap_plots() +
   plot_layout(design = layout_2,
               heights = c(rep(1, 6), 0.6),
               widths = c(rep(0.23, 2), 0.02, rep(0.23, 2)),
               axes = "collect",
               guides = "collect") +
   plot_annotation(tag_levels = list(c("A", rep("", 5), "B",
                                       "C", rep("", 5), "D"))))

# ggsave("./output/figs/figure_4.jpg",
#        figure_4,
#        bg = "white",
#        dpi = 600,
#        width = 7,
#        height = 7,
#        units = "in")
```

### Figure 5

```{r, echo=FALSE, message=FALSE, warning=FALSE, fig.width = 12, fig.height = 12}
list_figure_5
<- list()
list_figure_5[[1]] <- ridgeline.custom("glucose")
list_figure_5[2:5] <- map2(.x = rep("glucose", 4),
                           .y = c("ROM", "VR", "SK_delta", "KU_delta"),
                           .f = caterpillar.custom)
list_figure_5[[6]] <- ridgeline.custom("total_cholesterol")
list_figure_5[7:10] <- map2(.x = rep("total_cholesterol", 4),
                            .y = c("ROM", "VR", "SK_delta", "KU_delta"),
                            .f = caterpillar.custom)
(figure_5 <- list_figure_5 %>%
   wrap_plots(ncol = 5) +
   plot_annotation(tag_levels = "A"))

# ggsave("./output/figs/figure_5.jpg",
#        figure_5,
#        bg = "white",
#        dpi = 600,
#        width = 7,
#        height = 7,
#        units = "in")
```

## Scenarios explored for differences in skewness {#sec-skewness}

::: {.panel-tabset .tabset-pills .tabset-fade}

### Fig. S4

```{r}
#| label: fig-skewness-1
#| echo: false

files <- list.files("./output/figs/skewness/")
images <- lapply(files, function(x) {
  magick::image_read(paste0("./output/figs/skewness/", x))
})
images[[1]]
```

### Fig. S5

```{r}
#| label: fig-skewness-2
#| echo: false

images[[2]]
```

### Fig. S6

```{r}
#| label: fig-skewness-3
#| echo: false

images[[3]]
```

### Fig. S7

```{r}
#| label: fig-skewness-4
#| echo: false

images[[4]]
```

### Fig. S8

```{r}
#| label: fig-skewness-5
#| echo: false

images[[5]]
```

### Fig. S9

```{r}
#| label: fig-skewness-6
#| echo: false

images[[6]]
```

### Fig. S10

```{r}
#| label: fig-skewness-7
#| echo: false

images[[7]]
```

### Fig. S11

```{r}
#| label: fig-skewness-8
#| echo: false

images[[8]]
```

### Fig. S12

```{r}
#| label: fig-skewness-9
#| echo: false

images[[9]]
```

### Fig. S13

```{r}
#| label: fig-skewness-10
#| echo: false

images[[10]]
```

:::

## Scenarios explored for differences in kurtosis {#sec-kurtosis}

::: {.panel-tabset .tabset-pills .tabset-fade}

### Fig. S14

```{r}
#| label: fig-kurtosis-1
#| echo: false

files <- list.files("./output/figs/kurtosis/")
images <- lapply(files, function(x) {
  magick::image_read(paste0("./output/figs/kurtosis/", x))
})
images[[1]]
```

### Fig. S15

```{r}
#| label: fig-kurtosis-2
#| echo: false

images[[2]]
```

### Fig. S16

```{r}
#| label: fig-kurtosis-3
#| echo: false

images[[3]]
```

### Fig. S17

```{r}
#| label: fig-kurtosis-4
#| echo: false

images[[4]]
```

### Fig. S18

```{r}
#| label: fig-kurtosis-5
#| echo: false

images[[5]]
```

### Fig. S19

```{r}
#| label: fig-kurtosis-6
#| echo: false

images[[6]]
```

### Fig. S20

```{r}
#| label: fig-kurtosis-7
#| echo: false

images[[7]]
```

### Fig. S21

```{r}
#| label: fig-kurtosis-8
#| echo: false

images[[8]]
```

### Fig. S22

```{r}
#| label: fig-kurtosis-9
#| echo: false

images[[9]]
```

### Fig. S23

```{r}
#| label: fig-kurtosis-10
#| echo: false

images[[10]]
```

:::

# Software and package versions

```{r versions}
sessionInfo() %>% pander()
```

# References